Motion and appearance based Multi-Task Learning network for autonomous driving
ثبت نشده
چکیده
Autonomous driving has various visual perception tasks such as object detection, 1 motion detection, depth estimation and flow estimation. Multi-task learning (MTL) 2 has been successfully used for jointly estimating some of these tasks. Previous 3 work was focused on utilizing appearance cues. In this paper, we address the gap 4 of incorporating motion cues in a multi-task learning system. We propose a novel 5 two-stream architecture for joint learning of object detection, road segmentation 6 and motion segmentation. We designed three different versions of our network to 7 establish systematic comparison. We show that the joint training of tasks signifi8 cantly improves accuracy compared to training them independently even with a 9 relatively smaller amount of annotated samples for motion segmentation. To enable 10 joint training, we extended KITTI object detection dataset to include moving/static 11 annotations of the vehicles. An extension of this new dataset named KITTI MOD 12 is made publicly available via the official KITTI benchmark website . Our baseline 13 network outperforms MPNet which is a state of the art for single stream CNN-based 14 motion detection. The proposed two-stream architecture improves the mAP score 15 by 21.5% in KITTI MOD. We also evaluated our algorithm on the non-automotive 16 DAVIS dataset and obtained accuracy close to the state-of-the-art performance. 17 The proposed network runs at 8 fps on a Titan X GPU using a two-stream VGG16 18 encoder. Demonstration of the work is provided in. 19
منابع مشابه
Motion and Appearance Based Multi-Task Learning Network for Autonomous Driving
Autonomous driving has various visual perception tasks such as object detection, motion detection, depth estimation and flow estimation. Multi-task learning (MTL) has been successfully used for jointly estimating some of these tasks. Previous work was focused on utilizing appearance cues. In this paper, we address the gap of incorporating motion cues in a multi-task learning system. We propose ...
متن کاملMulti-Modal Multi-Task Deep Learning for Autonomous Driving
Several deep learning approaches have been applied to the autonomous driving task, many employing end-toend deep neural networks. Autonomous driving is complex, utilizing multiple behavioral modalities ranging from lane changing to turning and stopping. However, most existing approaches do not factor in the different behavioral modalities of the driving task into the training strategy. This pap...
متن کاملMODNet: Moving Object Detection Network with Motion and Appearance for Autonomous Driving
For autonomous driving, moving objects like vehicles and pedestrians are of critical importance as they primarily influence the maneuvering and braking of the car. Typically, they are detected by motion segmentation of dense optical flow augmented by a CNN based object detector for capturing semantics. In this paper, our aim is to jointly model motion and appearance cues in a single convolution...
متن کاملDriving Scene Perception Network: Real-time Joint Detection, Depth Estimation and Semantic Segmentation
As the demand for enabling high-level autonomous driving has increased in recent years and visual perception is one of the critical features to enable fully autonomous driving, in this paper, we introduce an efficient approach for simultaneous object detection, depth estimation and pixel-level semantic segmentation using a shared convolutional architecture. The proposed network model, which we ...
متن کاملEnd-to-end Multi-Modal Multi-Task Vehicle Control for Self-Driving Cars with Visual Perception
Convolutional Neural Networks (CNN) have been successfully applied to autonomous driving tasks, many in an endto-end manner. Previous end-to-end steering control methods take an image or an image sequence as the input and directly predict the steering angle with CNN. Although single task learning on steering angles has reported good performances, the steering angle alone is not sufficient for v...
متن کامل